GENETICS



BACKGROUND 

Cancer is a multifactorial disease with 
a striking heterogeneity due to genetic, 
epigenetic and transcriptional changes 
involving a myriad of genes and proteins. 
While these factors are relevant to clinical
prognosis and medical treatment, a
systems approach is needed to unravel
the complexities underlying the intertwined
mechanisms of carcinogenesis. In particular,
networks allow for straightforward inte- 
gration of molecular, genetic, clinical, and 
topological features embedded in measurable
cancer data. Modeling such data makes it
possible to assess significant changes in
conditions that affect, and in particular
dysregulate, cellular mechanisms.
Ultimately, treating cancer as a systems
disease implies a challenging translation
from systems biology to systems
medicine. Markers are key players in cancer,
characterized by their reference entity
(gene, protein, etc.) and by their individual
or composite nature. We aim to
show that the association of markers with 
detected network modules presents advan- 
tages compared to the consideration of 
individual markers. 

Network complexity can be character- 
ized in many possible ways, and both the 
specific data and the network structure 
represent factors conditioning any pos- 
sible inference approach. The structural 
organization of networks is measurable 
at both local and global network scale. 
Consider for instance node degree and link 
density as a starting point, then move to 
the analysis of degree-degree correlation, 
and finally to the exploration of modularity
(core/community structure). While
such a progression makes it possible to validate
the presence of non-random network
dynamics, the role of stochasticity suggests
that a network can be conceived
as one instance of an ensemble of networks
with certain structural properties,
i.e., a sample drawn from a network
space. Notably, focusing on the
structure of networks, and not on the
dynamics defined on them, simplifies the
concept of stationarity. Although natural
networks often arise from non-equilibrium
processes, the notion of equilibrium
investigated through the progression
described above (roughly speaking, from single
nodes to correlated nodes to modules and
cross-linked modules) can be considered
an abstraction within a frame in which
network ensembles are stationary entities
and each component network
can be seen as a state of the system.
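
The local-to-global measures mentioned above (node degree, link density, degree-degree correlation) can be sketched on a toy network as follows. This is a minimal illustration that assumes a simple undirected graph given as an edge list with no isolated nodes; the function names and the toy path graph are hypothetical and not part of the original analysis.

```python
from collections import defaultdict

def degrees(edges):
    """Node degree: number of links attached to each node."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def link_density(edges):
    """Fraction of possible links that are present: 2m / (n (n - 1)).
    Nodes are inferred from the edge list, so isolated nodes are ignored."""
    n = len(degrees(edges))
    return 2 * len(edges) / (n * (n - 1))

def degree_correlation(edges):
    """Pearson correlation between the degrees found at the two ends of
    each link (each undirected link is counted in both directions)."""
    deg = degrees(edges)
    xs, ys = [], []
    for u, v in edges:
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    mean = sum(xs) / len(xs)  # identical for xs and ys by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    var = sum((x - mean) ** 2 for x in xs)
    # Undefined when all nodes have the same degree (zero variance)
    return cov / var if var else float("nan")

# Toy path graph a - b - c (assumed data, for illustration only)
toy_edges = [("a", "b"), ("b", "c")]
toy_degrees = degrees(toy_edges)
toy_density = link_density(toy_edges)
toy_correlation = degree_correlation(toy_edges)
```

A negative correlation, as in this toy path, indicates that high-degree nodes tend to link to low-degree nodes; modularity analysis would then operate on top of these simpler descriptors.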

Marker development involves several complex
phases: discovery, identification,
and validation. Networks offer an interesting
opportunity for the study of markers: they
make it possible to establish
their relevance both as individual entities
and as components of a cluster or
module. Supported by recent literature
(Dao et al, 2011; Peer and Hacohen, 
2011; Bebek et al, 2012; Wu et al, 2012; 
Ben-Hamo and Efroni, 2013), we hypoth- 
esize that by switching their role, from 
individual to team players, markers may 
provide novel information on cancer, 
especially when studied in a pathway context.
In particular, markers examined at
a network scale may reveal their systems
relationships, generating synergistically
active candidates. This is important
because it bypasses limitations due to the
low reproducibility between differential
expression (DE) studies, which stems from
cellular heterogeneity within a tissue,
genetic heterogeneity among patients,
and other factors (Ein-Dor et al., 2005,
2006). Chuang et al. (2007) highlight that
sub-networks, i.e.,
connected components in a protein inter- 
action network, which are induced by 
markers, show superior reproducibility 
compared to isolated gene markers. Also,
genes with known breast cancer mutations
may not be detected by DE studies,
but still play an important role in
interconnecting DE genes. Sub-networks were
detected based on the maximization of the 
mutual information computed between 
the activity scores (averaged normalized 
gene expressions) and the disease status 
(metastasis/non-metastasis). Similarly, 
Lee et al. (2008) computed the activity 
of pathways through a related score cor- 
responding to the activity of the subset 
of genes in each pathway (called CORGs) 
found to better discriminate the disease 
status. Notably, we looked carefully at the 
new generation of pathway enrichment 
tools in the bioinformatics literature, and 
selected for our analyses GeneMANIA 
(Mostafavi et al, 2008; Warde-Farley 
et al., 2010). This tool integrates known
co-expression, co-localization, pathway,
protein interaction and genetic interaction
relationships with the DE gene list,
and predicts additional genes from the latter,
thereby strengthening the
functional enrichment analysis. This integrative
omics approach becomes a binder

Table 1 | Single gene marker versus module marker across epigenetic treatments.
Top: Gene and module marker comparisons under various treatments.
Middle: Annotation of conserved module markers between DAC and DAC + TSA treatments;
annotation of conserved module markers between TSA and DAC + TSA treatments.
Bottom: Examples of annotated module markers specific to co-treatment (DAC + TSA).

* Extended genes, i.e., genes missing in the microarray but found connected in the network by the described method, are listed in bold font in the last column.

of a wide range of biological information
layers in the same meta-network, thus
including many possible generators of
marker modules. Information from non-DE
genes can also be included if
it is important in terms of connectivity,
and is brought into the analysis through
over-representation or scoring techniques.
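
As a rough illustration of the sub-network scoring attributed above to Chuang et al. (2007), the sketch below averages z-normalized expression over the genes of a candidate sub-network and measures the mutual information between the resulting activity scores and the disease status. The function names, the equal-width binning of the scores and the toy expression values are assumptions made for illustration; this is not the published implementation.

```python
import math
from collections import Counter

def activity_scores(expr, genes):
    """Per-sample activity: mean of z-normalized expression over `genes`.
    expr maps each gene to a list of expression values, one per sample."""
    n = len(expr[genes[0]])
    z = {}
    for g in genes:
        vals = expr[g]
        mean = sum(vals) / n
        # Population standard deviation; fall back to 1.0 for constant genes
        sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / n) or 1.0
        z[g] = [(v - mean) / sd for v in vals]
    return [sum(z[g][j] for g in genes) / len(genes) for j in range(n)]

def mutual_information(scores, labels, bins=2):
    """Mutual information (in bits) between binned scores and class labels.
    Equal-width binning is an assumption of this sketch."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / bins or 1.0
    binned = [min(int((s - lo) / width), bins - 1) for s in scores]
    n = len(scores)
    joint = Counter(zip(binned, labels))
    p_bin = Counter(binned)
    p_lab = Counter(labels)
    return sum(
        (c / n) * math.log2((c / n) / ((p_bin[b] / n) * (p_lab[l] / n)))
        for (b, l), c in joint.items()
    )

# Toy data (hypothetical): two genes, four samples, binary disease status
toy_expr = {"geneA": [1, 2, 9, 10], "geneB": [0, 1, 8, 9]}
toy_status = [0, 0, 1, 1]
toy_scores = activity_scores(toy_expr, ["geneA", "geneB"])
toy_mi = mutual_information(toy_scores, toy_status)
```

In a search procedure of this kind, candidate sub-networks would be grown or pruned so as to increase the mutual information; the greedy search itself is not shown here.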

The question addressed in this Opinion
is therefore: how effective is an integrative
approach, and the additional
inference power it makes available, for our
understanding of the role of cancer markers?
In parallel studies that we are conducting,
whose results are centered on
treatment-specific profiling and pathway
annotation, we have performed functional
enrichment analysis of multi-drug resistant
osteosarcoma (MDR-OS) cells from
the HosDXR150 cell line after three epigenetic
treatments working against drug
resistance (Esteller, 2007; Bock, 2012).
The same data source is used here to
perform a second-generation analysis,
following the meta-network approach.
The treatments we want to study belong to
the realm of epigenetic therapy, and consist
of a de-methylating agent (5-Aza-dC,
DAC), a histone deacetylase inhibitor (TSA), and a
treatment combining both. We hypothesize
that our inference approach can
shed light on the impact on cells of
single versus combined epigenetic treatments,
by identifying module-specific and
module-shared markers at systems scale.

METHODS 

NETWORK AND MODULE MARKER 
GENERATION 

cDNA microarray analysis was performed 
to provide expression measurements of 
1920 genes of MDR-OS cells after the 
three treatments (details in Supplementary 
File Experiment.doc). The gene IDs of
the DE genes after each treatment were
fed to the GeneMANIA web tool
(http://www.genemania.org), and we explored
co-expression, genetic interaction,
co-localization, pathway and physical
interaction network data. Edge weighting was
based on GO-Biological Processes. The
integrated networks (IN) were generated
either from the DE genes alone or by also
adding the 20 most closely related genes,
according to the algorithm. The procedure
was repeated for all cell treatments,
forming 6 networks. A common
hypergeometric over-representation test
(p-value threshold = 0.05) determined enriched
pathways and functional modules;
those sharing the same group of genes were
merged into a single group, which was then
used as a module marker.
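
The over-representation and merging steps described above can be sketched as follows. This is a minimal, standard-library illustration of a one-sided hypergeometric test and of grouping enriched terms that share an identical gene set; the function names are hypothetical, the choice of background (for instance the 1920 genes on the array) is an assumption, and GeneMANIA's own enrichment procedure may differ in its details.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from a
    background of N genes, K of which belong to the pathway."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def merge_by_gene_set(enriched):
    """Group enriched terms whose supporting gene sets are identical.
    `enriched` maps a term name to an iterable of gene symbols."""
    merged = {}
    for term, genes in enriched.items():
        merged.setdefault(frozenset(genes), []).append(term)
    return merged

# Toy usage (hypothetical terms and genes, for illustration only)
toy_terms = {
    "term_1": ["g1", "g2"],
    "term_2": ["g2", "g1"],
    "term_3": ["g3"],
}
toy_modules = merge_by_gene_set(toy_terms)
# Is observing 4 pathway genes among 126 DE genes surprising if the
# pathway has 10 members in a 1920-gene background? (assumed numbers)
toy_p = hypergeom_pvalue(4, 126, 10, 1920)
toy_is_enriched = toy_p <= 0.05
```

Each key of `toy_modules` then plays the role of one module marker, annotated by all the terms that point to the same group of genes.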

RESULTS AND DISCUSSION 
GENE SETS AND NETWORKS 

The treatment with the de-methylating
agent DAC produced 57 significantly
up-regulated and 69 down-regulated genes. The
treatment with the histone deacetylase inhibitor
TSA produced 40 significantly up-regulated
and 68 down-regulated genes. The
combined treatment with DAC + TSA
produced 16 significantly up-regulated
and 46 down-regulated genes. These
gene lists were fed to GeneMANIA. Three
of the six resulting networks were generated
using the original DE genes (one
for the DAC treatment, one for TSA,
and one for the combined DAC + TSA),
while the other three networks were
generated by extending each network with
20 additional predicted nodes, i.e.,
highly connected genes in the IN (one
for each treatment type). For the DAC
treatment, 60% of the genes newly connected
within the IN were present in
the microarray but were not DE genes,
while the remaining 40% corresponded
to genes not present in the microarray
(PDGFRB, PIK3R1, MAPK3, EGF,
CASP9, GSK3B, STAT5A, and MAP2K2).
Exactly the same situation was observed
for the TSA treatment, where 40% of the
added genes were not in the microarray
(PIK3R1, MAPK3, MAP2K2, CHUK,
PIK3R2, IKBKB, PIK3CB, and TRAF2).
For the combined treatment, only 5 new
genes (25%) were added (IL12B, IL10,
CHUK, MAPK9, and IKBKB). Intuitively,
extending the network to highly connected
genes in an abstract space of data
multitude suggests a possible recovery
of potentially important genes excluded
from the experiment. In turn, the predictive
inference approach can generate
more testable hypotheses centered on the
possible role of markers.
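
The percentages quoted above, and the overlap between the genes recovered under each treatment, can be checked with simple set operations. The sketch below uses the gene lists given in the text and assumes that the 5 genes listed for the combined treatment are, like the other two lists, genes absent from the microarray among the 20 added nodes; the variable and function names are illustrative.

```python
# Genes added by the network extension that were not on the microarray,
# as listed in the text ("IL10" is assumed to be the intended symbol).
off_array = {
    "DAC": {"PDGFRB", "PIK3R1", "MAPK3", "EGF",
            "CASP9", "GSK3B", "STAT5A", "MAP2K2"},
    "TSA": {"PIK3R1", "MAPK3", "MAP2K2", "CHUK",
            "PIK3R2", "IKBKB", "PIK3CB", "TRAF2"},
    "DAC+TSA": {"IL12B", "IL10", "CHUK", "MAPK9", "IKBKB"},
}
N_ADDED = 20  # predicted nodes added to each extended network

def off_array_fraction(treatment):
    """Share of the 20 added genes that were absent from the array."""
    return len(off_array[treatment]) / N_ADDED

def shared(a, b):
    """Off-array genes recovered under both treatments, sorted by name."""
    return sorted(off_array[a] & off_array[b])

fractions = {t: off_array_fraction(t) for t in off_array}
dac_tsa_shared = shared("DAC", "TSA")
tsa_combo_shared = shared("TSA", "DAC+TSA")
dac_combo_shared = shared("DAC", "DAC+TSA")
```

Under this reading, the single treatments share three recovered genes, TSA and the co-treatment share two, and DAC and the co-treatment share none.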

MODULE MARKERS 

A module marker is a group of genes with
some detected properties, beyond their
simple collection. A module is considered
"active" if it includes DE genes, while
a more specific characterization involves
gene connectivity within a
network of integrated interaction, pathway,
co-expression, co-localization, and
genetic interaction data. Such a group of
genes may form a sub-network involving
one or multiple pathways, or functional
modules, and be over-represented in one or
more of those. Following this idea, the 126 DE
genes after DAC treatment yielded 240
module markers, the 108 DE genes after
TSA treatment yielded 161 module markers,
and the 62 DE genes after DAC + TSA
treatment yielded 207 module markers.
Generating module markers from an IN
may offer advantages over the use of only
interactions or pathways; one advantage
is the possibility of selecting specific or
multiple data types. A multitude of data
types is represented in
the Supplementary File Figure.doc (panel
A), with three out of six module markers
appearing after DAC treatment, and
the other three after TSA treatment. A
combination of co-expression (emphasized in
purple), pathway (blue), interaction (red),
co-localization (gray), predicted
interaction (yellow), or genetic interaction
(green) characterizes the modules, and
the removal of any specific data type would
affect the IN's integrity. A more limited
variety of data types underlies the Supplementary
File Figure.doc (panel B): two modules (b
and f) come from co-expression, one (a)
comes mainly from pathways, and the rest
from both. The other edges (interactions,
predicted interactions and co-localization)
only appear in a few cases, implying that
their removal would not affect the
connectivity of the PIN.

INTER-TREATMENT COMPARISONS 

Table 1 (top) shows that 9 out of 126 genes
expressed after DAC treatment are
also expressed after co-treatment, and that 20
out of 108 genes expressed after TSA
are also expressed after the co-treatment.
When comparing module markers generated
from the DE genes, no pathway
is conserved between single and combined
treatments. However, things change
with the extended gene set. The num- 
ber of conserved genes between treat- 
ments slightly increases: 13 out of 126 DE 
genes appeared both in DAC and in co- 
treatment, while 25 out of 108 DE genes 
appeared both in TSA and in co-treatment. 
Three of the four additional genes in the 

FIGURE 1 | An integrative regulatory map. Integrated interactomic
relationships are presented, starting from the DE genes derived from the
cDNA microarray analysis, and considering connectivity with non-DE genes
(marked in bold font) obtained from the reported network extension and
functional enrichment tools. The non-DE genes excluded from the
experiments (labeled without colors) make it possible to establish systems
relationships with DE genes (down-regulated after treatment, which appear in
green frame, and up-regulated after treatment, which appear in red frame), and to
highlight the biological influence of epigenetic treatments. We identified
major influences in terms of: activation of osteoblast differentiation and
apoptotic signaling, and inhibition of cell proliferation, metastasis and
angiogenesis. In particular, following the described treatments, the
re-expression of epigenetically silenced key genes re-establishes cellular
homeostasis through mechanisms such as: 1. Osteoblast differentiation
(i.e., IL-6, IL6ST, IGF1, TIMP4, TIMP1, BMP-7, all emerging from combined
treatment); 2. Drug sensitivity of MDR-OS cells through the re-activation of
both extrinsic apoptotic signaling (i.e., TNF-1 B, RIPK-3, FAS, FADD, genes
indicated with red frame and emerging from combined treatment) and
intrinsic apoptotic signaling (i.e., ALG2, P53, P73, CASP10, ERCC6, BAX,
BAD, BNIP-3L, genes indicated with red frame and emerging from DAC and
TSA single treatments, as illustrated in Table 1); 3. Inhibition of cell
proliferation, angiogenesis and metastasis by down-regulation of some genes
(i.e., VEGF, MAPK1, C-MYC, REL-A, MMP-2, genes indicated with green
frame), as a consequence of re-expression of epigenetically modified genes
(indirect influence of the treatment). The integrated approach makes it
possible to better decipher the complex cellular mechanisms which led the
tumor cells to acquire the multi-drug resistance phenotype and a pro-survival
advantage, therefore identifying tumor-specific markers useful for future
targeted therapy.



first case (IL6, RELA, IFNB1) were genes
expressed in DAC but not in DAC +
TSA, while the fourth gene (AKT1) was
not DE. Similarly, RELA was in TSA but
not in DAC + TSA, while the other four
genes (FADD, AKT1, CASP8, NFKB1)
were not expressed in either TSA or
co-treatment. An interesting result appears
in column 4. Besides the modest increase
in conserved genes, results show a substantial
increase in the number of conserved
module markers when using the IN.
In this case, 24 module markers are conserved
between DAC and DAC + TSA, while 26
module markers are conserved between
TSA and DAC + TSA (see Supplementary
Table, part A and B, respectively). This evidence
naturally depends on the types of
treatments which are available, but opens
the possibility of assessing whether
conserved modules can be considered persistent
between treatments of different
nature. Overall, this approach can inform
the choice of experimental design and,
through module markers, it highlights
whether indirect
effects should also be accounted for through the
embedded connectivity. Table 1 (middle)
shows some interesting conserved module
markers for both cases, while Table 1 (bottom)
shows module markers specific to
co-treatment. Notably, the DE genes after
DAC and TSA present 13 gene markers
in common, but 42 module markers
in common after the extension.
As an example of integrative analysis,
Figure 1 reports a map of gene and pathway
regulation under the influence of the
described treatments, single and combined
(Table 1). The Supplementary Table file
reports lists of gene and marker modules.
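
The inter-treatment comparison above can be thought of as an intersection over annotated modules. The sketch below assumes that each treatment's module markers are stored as a mapping from an annotation term to its gene set, and that a module marker counts as conserved when the same term is enriched under both treatments; the function name and the toy terms and genes are hypothetical and do not reproduce the reported counts.

```python
def conserved_modules(modules_a, modules_b):
    """Return {term: shared genes} for terms enriched under both
    treatments. Each argument maps an annotation term to a gene set."""
    common_terms = modules_a.keys() & modules_b.keys()
    return {t: modules_a[t] & modules_b[t] for t in common_terms}

# Toy module markers (hypothetical terms and genes)
single_treatment = {
    "module_1": {"g1", "g2", "g3"},
    "module_2": {"g4", "g5"},
}
co_treatment = {
    "module_1": {"g2", "g3", "g6"},
    "module_3": {"g4", "g7"},
}
toy_conserved = conserved_modules(single_treatment, co_treatment)
n_conserved = len(toy_conserved)
```

A stricter definition could additionally require a minimum number of shared genes per conserved module; which definition is appropriate depends on how the module markers were merged upstream.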

FINAL REMARKS 

Epigenetics refers to heritable changes in
gene expression that do not involve changes
in the DNA sequence. Gene silencing is a complex
biological process which involves
methylation, and leads to disease development
once dysregulated. The high frequency
of epigenetic changes in cancer has
motivated research into new therapeutic
approaches aimed at reversing gene silencing.
DNA methylation inhibitors, together
with histone deacetylase (HDAC) inhibitors, are
examples of drugs conceived
for the re-activation of silenced
genes. Future avenues include the activation
of single genes by exploiting single
agents or combinations of
epigenetic drugs, thus emphasizing the
synergistic activities between DNA methylation
inhibitors and HDAC inhibitors, and considering
the likely non-specificity in terms of
gene re-activation. The identification of
modules at the network scale leads to an
integrative systems approach which goes
beyond single marker analysis and exploits
synergistic marker dynamics in support
of combinatorial experiments. Our preliminary
results show that the recovery
of latent connectivity may re-position
the markers depending on the module-integrated
biodata multitude and on the
nature of the edges linking the nodes.